ABSTRACT

It is shown how wavelets may be used to analyse the absorption properties of the 
Lyα forest. The Discrete Wavelet Transform of a QSO spectrum is used to decompose
the light fluctuations that comprise the forest into orthogonal wavelets. It is demon- 
strated that most of the signal is carried by the moderate to lower frequency wavelets 
in high resolution spectra, and that a statistically acceptable description of even high 
signal-to-noise spectra is provided by only a fraction (10-30%) of the wavelets. The 
distributions of the wavelet coefficients provide a statistical basis for discriminating be-
tween different models of the Lyα forest. The method is illustrated using the measured
spectrum of Q1937-1009. The procedure described is readily automated and may be 
used to process both measured spectra and the large number of spectra generated by 
numerical simulations, permitting a fair comparison between the two. 

Key words: intergalactic medium - methods: data analysis - quasars: absorption lines
1 INTRODUCTION

Measurements of QSO spectra show that the Intergalactic
Medium (IGM) is composed of highly inhomogeneous struc-
tures. Ever since their identification by Lynds (1971) and
the pioneering survey of Sargent et al. (1980), these inho-
mogeneities have been described as discrete absorption sys-
tems, the Lyα forest. With the view that the systems arise
from individual intervening gas clouds, the Lyα forest has
been characterized using traditional absorption line statis- 
tics, most notably the line equivalent widths and, as the 
spectra improved in resolution and signal-to-noise ratio, the 
Doppler widths and H i column densities through Voigt pro- 
file line fitting to the features. 

In the past few years, numerical simulations have suc- 
cessfully modelled many of the measured properties of the 
forest, showing that the absorption systems may arise as a 
consequence of cosmological structure formation (Cen et 
al. 1994; Zhang, Anninos & Norman 1995; Hernquist et al. 
1996; Bond & Wadsley 1997; Zhang et al. 1997; Theuns, 
Leonard & Efstathiou 1998). The simulations have shown, 
contrary to the picture in which the systems are isolated 
intergalactic gas clouds, that most of the systems origi- 
nate in an interconnected web of sheets and filaments of 
gas and dark matter (Cen et al. 1994; Bond & Wadsley 
1997; Zhang et al. 1998). Alternative statistical methods 
were subsequently introduced for describing the forest using 
the more direct measurements of the induced light fluctu- 
ations. These include the 1-point distribution of the fluc- 
tuations (Miralda-Escude et al. 1996; Zhang et al. 1997), 
and a quantity related to the 2-point distribution based on a
weighted difference of the light fluctuations in neighbouring
wavelength pixels (Miralda-Escude et al. ). A direct esti- 
mate of the 2-point transmission correlation function was 
made by Zuo & Bond (1994). 

While the newer methods for analysing the Lyα forest
avoid the identification of absorption lines and the fitting of 
Voigt profiles, they are not necessarily fundamentally differ- 
ent in their description of the spectra. For instance, Zhang 
et al. (1998) find that the distribution of optical depth per 
pixel in their simulation may be recovered by modelling the 
spectra entirely by discrete absorption lines with Voigt pro- 
files. Rather, the more direct methods circumvent a difficulty
that has long plagued attempts to characterize the absorbers 
in terms of Voigt profiles: the sensitivity of the resulting line 
statistics to noise and to the fitting procedure. Absorption 
line fitting of necessity requires arbitrary decisions to be 
made regarding the setting of the continuum level, the de- 
blending of features, and a decision on the acceptability of a 
fit. Different observational groups report different distribu- 
tions for the line parameters. Most discrepant has been the 
inferred distribution of line widths. Even with the highest 
quality data gathered to date using the Keck HIRES, agree- 
ment is still lacking, with Hu et al. (1995) finding a narrower 
Doppler parameter distribution with a significantly higher 
mean than found by Kirkman & Tytler (1997). The dif- 
ferences are important, as cosmological simulations predict 
comparable differences for a range of plausible cosmological 
models (Machacek et al. 2000; Meiksin et al. 2000). 

The purpose in this paper is to develop a method that 
provides an alternative objective description of the statistics 
of the Lyα forest. Ultimately the goal is to employ the same
method for analysing both observational data and data de-
rived from numerical simulations in order to compare the
two on a fair basis. Because of the large number of synthetic 
spectra generated from a simulation necessary to provide a 
correct average description of the forest, two principal re- 
quirements of the procedure are that it be fast and easily 
automated. Although automated or semi-automated Voigt 
profile fitting procedures exist (AutoVP, Dave et al. 1997; 
VPFIT, developed by Carswell and collaborators), these 
procedures still require arbitrary decisions to be made to 
obtain acceptable fits. The complexity of the codes makes 
it difficult to assess the statistical significance of differences 
between the measured distributions of the absorption line 
parameters and those predicted. The codes also are compu- 
tationally expensive, making very costly their application to 
the large number of simulated spectra required to obtain a 
statistically valid average of the line parameters. For these 
reasons, a faster, less complex method would be desirable.
The Voigt profile fitting codes yield important parameters,
like the linewidths, which contain physical information (e.g.,
gas temperature and turbulent velocities) that the direct-
analysis methods do not. It would thus be desirable for an
alternative method to retain some of this information. The
method presented here utilizes wavelets to characterize the
absorption statistics of the Lyα forest. It is not intended to
be a replacement for Voigt profile fitting, but a fast alterna- 
tive that allows a ready comparison between the predictions 
of numerical models and measured spectra and a clear sta- 
tistical analysis of the results. 

The outline of the paper is as follows: in §2 it is shown
how the statistics of the Lyα forest may be characterized
using wavelets. In §3 the method is applied to the measured
spectrum of a high redshift QSO. The results are summa-
rized in §4.



2 ANALYSING THE Lyα FOREST WITH
WAVELETS 

2.1 Terminology 

Although wavelets have been used in signal processing, im- 
age analysis, and the study of fluid dynamics for a decade, 
they are only beginning to enter the vernacular of as- 
tronomers. Accessible introductions are provided in Press et 
al. (1992), and in Slezak, Bijaoui & Mars (1990) and Pando 
& Fang (1996), who apply wavelets to study the clustering of 
galaxies and Lyα absorbers, respectively. More complete ac-
counts of wavelet methodology are Chui (1992), Daubechies
(1992), and Meyer (1993). The description here is confined 
to those elements necessary to introduce the notation and 
terminology that will be used below. 

Wavelets are defined variously in the literature. The def- 
inition of most use here, somewhat restrictive but appropri- 
ate to a multiresolution analysis using the Discrete Wavelet 
Transform (DWT), is (Meyer): 

A wavelet is a square-integrable function $\psi(x)$ defined in real
space such that $\psi_{jk}(x) = 2^{j/2}\,\psi(2^j x - k)$, where $j$ and $k$ are integers,
is an orthonormal basis for the set of square-integrable functions.

The wavelet $\psi(x)$ satisfies $\int dx\,\psi(x) = 0$, and is generally
chosen to be concentrated near $x = k\,2^{-j}$. Its defining prop-
erties permit it to perform two operations governed by the
values of $j$ and $k$. Smaller values of $j$ correspond to coarser
variations in $f(x)$, while differing values of $k$ correspond to
shifting the centre of the transform.

The wavelet coefficients of a function $f(x)$ are defined
by

$$w_{jk} = \int dx\, f(x)\,\psi_{jk}(x). \qquad (1)$$

The set of coefficients $\{w_{jk}\}$ comprises the wavelet trans-
form of the function $f(x)$. The function may then be recov-
ered through the inverse transform

$$f(x) = \sum_{jk} w_{jk}\,\psi_{jk}(x), \qquad (2)$$

since the set of functions $\psi_{jk}$ forms a complete orthonor-
mal basis. The wavelet coefficients at a level $j$ express the
changes between the smoothed representations of $f(x)$ at
the resolution scales $j + 1$ and $j$.
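The decomposition and its inverse can be illustrated with a minimal sketch. It is an assumption-laden toy, not the procedure used in this paper: the Haar wavelet (the lowest order member of the Daubechies family) is substituted for brevity, the function names `haar_dwt` and `haar_idwt` are hypothetical, and the pyramid is carried down to a single average, so that level $j$ holds $2^j$ coefficients starting from $j = 0$.

```python
import math

ROOT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """Orthonormal Haar DWT of a sequence whose length is a power of two.

    Returns (average, levels), where levels[j] holds the 2**j detail
    coefficients w_jk at level j (j = 0 is the coarsest level).
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    smooth = list(signal)
    levels = []
    while len(smooth) > 1:
        pairs = list(zip(smooth[0::2], smooth[1::2]))
        # Detail coefficients: the change between successive smoothings.
        levels.insert(0, [(a - b) / ROOT2 for a, b in pairs])
        # Smoothed representation at the next coarser resolution.
        smooth = [(a + b) / ROOT2 for a, b in pairs]
    return smooth[0], levels


def haar_idwt(average, levels):
    """Invert haar_dwt, rebuilding the signal from coarsest to finest level."""
    smooth = [average]
    for detail in levels:
        finer = []
        for s, d in zip(smooth, detail):
            finer.extend([(s + d) / ROOT2, (s - d) / ROOT2])
        smooth = finer
    return smooth


signal = [4.0, 2.0, 5.0, 5.0, 1.0, 0.0, 3.0, 7.0]
average, levels = haar_dwt(signal)
recovered = haar_idwt(average, levels)
```

Because the basis is orthonormal, the sum of the squared coefficients equals the sum of the squared signal values, and the inverse transform recovers the signal to rounding error.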

Several functions may serve as wavelets. A set that 
has proven particularly useful was developed by Daubechies 
(Daubechies 1992). These functions are constructed to have
vanishing moments up to some value $p$, and the functions
themselves vanish outside the range $0 < x < 2p + 1$. The
wavelet coefficients decrease rapidly with p for smooth func- 
tions. Accordingly, the higher order Daubechies wavelets are 
the most suitable for analyzing smooth data. The DWT is 
computed using the pyramidal algorithm as implemented in 
Numerical Recipes (Press et al. ). The Daubechies wavelet 
of order 20 is chosen throughout. 

2.2 Monte Carlo simulations 

The properties of the wavelet transform of the Lyα forest are
examined by performing Monte Carlo realizations of spectra.
The spectra are constructed from discrete lines with Voigt
profiles using the H I column density and Doppler param-
eter distributions found by Kirkman & Tytler. Specifically,
the H I column densities $N_{\rm HI}$ are drawn from a power law
distribution of slope 1.5 between $12.5 < \log_{10} N_{\rm HI} < 16$
and the Doppler parameters $b$ from a gaussian with mean
23 km s$^{-1}$ and standard deviation 14 km s$^{-1}$. A cut-off in $b$
is imposed according to $b > 14 + 4(\log_{10} N_{\rm HI} - 12.5)$ km s$^{-1}$.
The resulting average Doppler parameter is 31 km s$^{-1}$.
The number density of lines per unit redshift matches that
of Kirkman & Tytler at $z = 3$. The resolution is set at
$\lambda/d\lambda = 5 \times 10^4$, and gaussian noise is added according to a
specified continuum signal-to-noise ratio per pixel. This is
the fiducial model used in all the simulations unless stated 
otherwise. Segments 128 pixels wide were found adequate 
for extracting the statistical properties of the wavelet coef- 
ficients. 
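The drawing of line parameters for the fiducial model can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the paper: the function name `draw_line` is hypothetical, the power law is sampled by inverting its cumulative distribution, and the cut-off in $b$ is imposed by simple rejection. Line placement, Voigt profile generation and noise are not included.

```python
import math
import random


def draw_line(rng, slope=1.5, log_n_min=12.5, log_n_max=16.0,
              b_mean=23.0, b_sigma=14.0):
    """Draw one (log10 N_HI, b in km/s) pair for the fiducial line model."""
    # Inverse-CDF sampling of dN/dN_HI proportional to N_HI**(-slope).
    e = 1.0 - slope
    lo = (10.0 ** log_n_min) ** e
    hi = (10.0 ** log_n_max) ** e
    n_hi = (lo + rng.random() * (hi - lo)) ** (1.0 / e)
    log_n = math.log10(n_hi)
    # Column-density dependent cut-off: b > 14 + 4 (log10 N_HI - 12.5).
    b_cut = 14.0 + 4.0 * (log_n - log_n_min)
    while True:
        b = rng.gauss(b_mean, b_sigma)
        if b > b_cut:
            return log_n, b


rng = random.Random(42)
lines = [draw_line(rng) for _ in range(5000)]
mean_b = sum(b for _, b in lines) / len(lines)
```

The sample mean `mean_b` may be compared with the average Doppler parameter quoted above for the truncated distribution.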

A representative spectrum and its discrete wavelet
transform are shown in Figure 1. A block at resolution $j$ is
$128/2^j$ pixels wide and $2^j$ pixels long for $j = 1$ to 6. The res-
olution becomes finer as $j$ increases from 1 to 6 (downwards).
The uppermost level ($j = 0$) corresponds to smoothed av-
erages of the spectrum. The wavelet coefficients tend to in-
crease in magnitude with decreasing resolution (decreasing
$j$). The low values indicate that only small changes occur in
the spectrum when smoothed at one resolution level to the
next higher. The small values are desirable, as they signify
that the dominant absorption features in the spectra are ade-
quately resolved.

Figure 1. (a) A representative synthetic spectrum showing the
Lyα forest at $z = 3.0$ at a resolution of $\lambda/d\lambda = 5 \times 10^4$. (b) The
absolute magnitudes of the wavelet coefficients are shown in the
grayscale map. The map is linear and ranges between 0 (white)
and 0.5 (black).
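The bookkeeping of the hierarchy described above can be checked with a few lines. The helper name `dwt_layout` is hypothetical; the sketch only tabulates, for a 128-pixel segment, the number of coefficients and the block width in pixels at each level $j = 1$ to 6.

```python
def dwt_layout(n_pixels=128):
    """Return (level j, number of coefficients, pixels per block) rows."""
    n_levels = n_pixels.bit_length() - 1  # 128 pixels -> 7
    return [(j, 2 ** j, n_pixels // 2 ** j) for j in range(1, n_levels)]


rows = dwt_layout()
n_detail = sum(count for _, count, _ in rows)
n_average = 128 - n_detail  # coefficients left for the coarse scale averages
```

The detail levels account for 126 of the 128 coefficients, leaving the pair of coarse scale averages that make up the uppermost level.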

Because the wavelet functions form a complete set of ba- 
sis functions, the full set of wavelet coefficients completely 
describes the spectrum: the spectrum may be reconstructed 
identically from the inverse transform. For noisy spectra, 
however, it will generally be unnecessary to retain the full 
set of coefficients. Indeed, this is the motivation for multi- 
resolution data compression. By employing a judicious set 
of basis functions, a signal may be compressed into only a 
small fraction of its original size. The method of choosing the
optimal basis set such that the compressed signal matches 
the original as closely as possible in a least squares sense 
with the least number of retained basis elements is known as 
Proper Orthogonal Decomposition or the Karhunen-Loeve 
procedure (see Berkooz, Holmes & Lumley 1993 for a re- 
view). The basis set, however, will in general differ from 
signal to signal if its components are highly variable, as in 
the case of the Lyα forest. Although not optimal in the least
squares sense, the wavelet basis nonetheless achieves a large
amount of data compression and has the advantage of gen-
erality. The following describes how wavelets may be applied to
assess the amount of useful information in a spectrum.

Two measures of the information content of a noisy
spectrum are considered, one based on $\chi^2$ and the second on
entropy. If $s(x_i)$ is the original spectrum defined at $N$ points
$x_i$ (e.g., wavelength or velocity), and $s_n(x_i)$ is the spectrum
reconstructed from the $n$ largest (in magnitude) wavelet co-
efficients, then

$$\chi^2 = \sum_{i=1}^{N} \left[\frac{s(x_i) - s_n(x_i)}{\sigma_i}\right]^2, \qquad (3)$$

where $\sigma_i$ is the measurement error associated with pixel
$i$. For gaussian distributed measurements, the expectation
value of $\chi^2$ is the number of degrees-of-freedom. If $n$ wavelet
coefficients are retained, the number of degrees-of-freedom
is $N - n$. (Hence, for example, $\chi^2 = 0$ is expected for $n = N$.)
The reduced $\chi^2_{\rm red} = \chi^2/(N - n)$ then defines the optimal
value of $n$ for truncating the wavelet coefficients.

The information content may also be expressed in terms
of the wavelet coefficients directly as an "entropy"

$$S = -\sum_{jk} \alpha_{jk}^2 \log \alpha_{jk}^2, \qquad (4)$$

where the $\alpha_{jk}$ are the normalized coefficients

$$\alpha_{jk} = \frac{w_{jk}}{\left(\sum_{jk} w_{jk}^2\right)^{1/2}}. \qquad (5)$$

This quantity behaves like a physical entropy in the sense
that it is maximum when the signal is completely random
so that the full set of coefficients $\{w_{jk}\}$ is required to de-
scribe it, while it vanishes when the signal may be entirely
described by a single coefficient.
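The limiting behaviour of the entropy can be verified numerically. The sketch below assumes the natural logarithm and takes the coefficients as a flat list; the function name `wavelet_entropy` is hypothetical.

```python
import math


def wavelet_entropy(coeffs):
    """Entropy S = -sum(alpha**2 * log(alpha**2)) of normalized coefficients."""
    total = sum(w * w for w in coeffs)
    if total == 0.0:
        raise ValueError("at least one coefficient must be non-zero")
    s = 0.0
    for w in coeffs:
        a2 = w * w / total  # alpha_jk**2; these sum to unity
        if a2 > 0.0:        # the limit of x log x as x -> 0 is 0
            s -= a2 * math.log(a2)
    return s


single = wavelet_entropy([3.0, 0.0, 0.0, 0.0])    # one coefficient suffices
spread = wavelet_entropy([1.0, -1.0, 1.0, -1.0])  # all coefficients equal
```

With a single non-zero coefficient the entropy vanishes, while for $M$ coefficients of equal magnitude it reaches its largest possible value, $\log M$.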

The reduced $\chi^2$ for an ensemble of Monte Carlo re-
alizations is shown in Fig. 2 as a function of the fraction
$(N - n)/N$ of the wavelet coefficients discarded. As the
signal-to-noise ratio per pixel increases, the value of $\chi^2_{\rm red}$
for a given $n$ increases. In all cases, however, there is some
$n < N$ for which $\chi^2_{\rm red} = 1$. This suggests that an acceptable
fit to a noisy spectrum may be provided by only a fraction
$n/N$ of the full set of coefficients, with the fraction required
increasing as the noise level decreases.

The entropy $S$ is shown in Figure 3. The entropy stays
nearly constant out to $\chi^2_{\rm red} = 1$, indicating that little infor-
mation has been lost by discarding the small coefficients. As
$\chi^2_{\rm red}$ increases, eventually the entropy decreases as informa-
tion is lost. Due to the greater information content of the
less noisy spectra, as the noise level is decreased, the entropy
remains constant to increasingly higher values of $\chi^2_{\rm red}$ before
declining.
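The choice of the truncation point may be sketched as follows, under two simplifying assumptions that are not part of the text: the pixel error $\sigma$ is taken to be uniform, and the transform is orthonormal, so that by Parseval's theorem $\chi^2$ equals the sum of the squared discarded coefficients divided by $\sigma^2$. The function name `truncate_at_unit_chi2` is hypothetical.

```python
def truncate_at_unit_chi2(coeffs, sigma):
    """Smallest n for which keeping the n largest |w| gives reduced chi2 <= 1.

    Assumes an orthonormal transform and a uniform pixel error sigma, so that
    chi2 = sum(discarded w**2) / sigma**2. Returns (n, reduced_chi2).
    """
    n_total = len(coeffs)
    squares = sorted((w * w for w in coeffs), reverse=True)
    discarded = sum(squares)  # n = 0: every coefficient is discarded
    for n in range(n_total):
        reduced = discarded / (sigma * sigma * (n_total - n))
        if reduced <= 1.0:
            return n, reduced
        discarded -= squares[n]  # retain the next largest coefficient
    return n_total, 0.0


coeffs = [10.0, -8.0, 0.5, -0.3, 0.2, 0.1, -0.4, 0.6]
n_keep, chi2_red = truncate_at_unit_chi2(coeffs, sigma=0.5)
```

Since the coefficients are discarded in order of increasing magnitude, the reduced $\chi^2$ falls monotonically as $n$ grows, so the first $n$ that satisfies the condition is the smallest acceptable one.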



2.3 Statistics of the wavelet coefficients 

It was shown above how wavelets may be used to character-
ize the noise properties of a spectrum. The wavelet coeffi-
cients, however, may also be used to characterize the statis-
tics of the Lyα forest itself.

The distributions of the coefficients (in absolute value)
for the several resolution levels are shown in Fig. 4 for a set
of simulated spectra with S/N = 50, typical of the Keck
HIRES spectra. The number of coefficients at a level $j$ is
$2^j$, with $j = 1$ corresponding to the coarsest resolution, and
$j = 6$ to the finest for the $2^7 = 128$ pixels used in a spectrum.
The finest resolution ($j = 6$) curve is the steepest. As the
resolution becomes increasingly coarse, the amplitude of the
coefficients increases, as was found in Fig. 1. This indicates
that most of the information in the spectrum is carried by
the coarser levels (as well as by the two coarse scale aver-
ages, not shown). The finest level has resolved the spectral
structures, with little difference between the smoothed rep-
resentations of the spectrum at resolution levels $j = 5$ and 6.
Applying a cut-off in the coefficients corresponding to
$\chi^2_{\rm red} = 1$ yields average numbers of retained coefficients,
out of the initial $2^j$ for $j = 1, \ldots, 6$, of 1.9, 3.8, 7.4, 11.6,
6.9, and 3.4, respectively. While almost all of the coeffi-
cients for $j \le 3$ are needed, a decreasing fraction is required
to describe the spectra at higher resolution.

Footnote: Meyer (1993) defines the entropy to be the exponential of S.

© 0000 RAS, MNRAS 000, 000-000

A. Meiksin

Figure 2. The dependence of the reduced $\chi^2_{\rm red}$ on the fraction
of discarded wavelet coefficients. The curves increasing from the
bottom are for signal-to-noise ratios of 10, 30, 50, 100, 300, and
1000. As the noise level increases, an increasing fraction of the
coefficients may be discarded with the remainder still providing
a statistically acceptable fit to the spectra.

Figure 3. The entropy of the spectra defined in terms of the
wavelet coefficients. Little information is lost from the spectra for
$\chi^2_{\rm red} \le 1$, as measured by the entropy. The entropy curves from
left to right are for signal-to-noise ratios of 10, 30, 50, 100, 300,
and 1000.

Figure 4. The normalized distribution of the wavelet coefficients
at the levels $j = 1, \ldots, 6$. The coefficients increase in magnitude at
the coarser (lower $j$) resolutions, indicating that they carry most
of the information in the spectra.

The distributions are insensitive to the signal-to-noise 
ratio, as shown in Fig. 5. Except for the lowest ratio of 10,
the curves coincide, showing that they may be measured ac- 
curately even for a varying signal-to-noise ratio in a spec- 
trum, provided it is not too low. 

To demonstrate that the wavelet coefficient distribu-
tions may be used to discriminate between different pre-
dictions for the statistical properties of the Lyα forest, a
second set of Monte Carlo realizations with alternative col-
umn density and Doppler parameter distributions is gener-
ated. The parameters adopted are those reported by Hu et
al. They found that the forest statistics are consistent with
an H I column density distribution with a slope of 1.5 for
clouds with $12.3 < \log N_{\rm HI} < 14.5$ and a Gaussian Doppler
parameter distribution with mean 28 km s$^{-1}$, standard de-
viation 10 km s$^{-1}$, and a sharp cut-off below 20 km s$^{-1}$.
The resulting average Doppler parameter is 37 km s$^{-1}$. The
simulation is normalized to the same line density per unit
redshift at $z = 3$ as found by Hu et al., but the column den-
sity distribution is extended to $\log N_{\rm HI} = 16$ to be consistent
with the previous set of simulations. The resulting wavelet
coefficient distributions are compared with those from the
previous simulation for $j = 3$, 4, and 5 in Fig. 6. The wavelet
coefficients are able to distinguish between the two models.

The Kolmogorov-Smirnov test may be used to assess 
the probability that the wavelet coefficients of a measured 
spectrum match a given distribution for each resolution level 
j. The most stringent test, however, is given by combining
the probabilities for all the distributions. Because any given 
absorption feature may be expected to affect the coefficients 
at more than a single resolution level j, it is possible that the 
coefficients corresponding to a given set of nested blocks for 
different j (see Fig. 1) may be correlated. In this case, the
probabilities of matching the various distributions may not 
be combined as if they were independent. To determine the 
degree to which the distributions may be treated as indepen- 
dent, the correlations are measured for coefficients between 
the various levels j corresponding to the same hierarchy of 
blocks, and then averaged over all the hierarchies, for a set 
of Monte Carlo realizations using the fiducial forest model. 
The results are shown in Table 1. (The level $j = 0$ refers
to the correlations with the pair of coefficients correspond-
ing to the coarse scale averages.) A signal-to-noise ratio of
50 is assumed, and a cut-off in the coefficients is applied to
ensure $\chi^2_{\rm red} \approx 1$. The error on the correlations is ~ 0.1%. Al-
though the correlations are small, they are not absent. They
are sufficiently small, however, that treating the probabili- 
ties for the different distributions as independent should be 
an adequate approximation for model testing. 
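A per-level test and the combination of the resulting probabilities might be sketched as follows. Several choices here are assumptions not specified in the text: the measured coefficients at each level are compared with a simulated sample through the two-sample Kolmogorov-Smirnov statistic, the significance uses the standard asymptotic series, and the per-level probabilities are combined with Fisher's method, which is valid only to the extent that the levels are independent. The function names are hypothetical.

```python
import math


def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic D and asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        # Difference between the two empirical cumulative distributions.
        d = max(d, abs(i / na - j / nb))
    ne = na * nb / (na + nb)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:  # the series below converges poorly; the value is ~1
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


def fisher_combine(p_values):
    """Combine independent p-values: -2 sum(ln p) follows chi2 with 2k dof."""
    k = len(p_values)
    half_x = -sum(math.log(max(p, 1e-300)) for p in p_values)
    # Survival function of chi2 with an even number (2k) of degrees of freedom.
    term = total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Usage: one (measured, simulated) pair of coefficient samples per level j.
levels = {3: ([0.1, 0.4, 0.2, 0.3], [0.15, 0.35, 0.25, 0.3, 0.1]),
          4: ([0.05, 0.1, 0.2, 0.1], [0.1, 0.1, 0.15, 0.05, 0.2])}
p_levels = [ks_two_sample(meas, sim)[1] for meas, sim in levels.values()]
p_joint = fisher_combine(p_levels)
```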



Figure 5. The normalized distribution of the wavelet coefficients 
for the levels j = 3, 4, and 5 for signal-to-noise ratios of 10, 30,
50, 100, 300, and 1000. Except for S/N = 10 (dotted lines), the 
distributions overlap. 
2.4 Data compression

One of the key features of wavelets is their ability to com-
press data. Figs. 2 and 3 show that it is possible to fit a
spectrum using only a subset of the wavelets used in its
DWT at a statistically acceptable level ($\chi^2_{\rm red} = 1$), without
significantly degrading the information content of the spec- 
trum as measured by the wavelet entropy. This suggests that 
filtering the spectrum in this way may provide a usable spec- 
trum that is relatively noiseless and suitable for absorption 
line fitting. 

This is illustrated by performing Voigt profile fitting 
to Monte Carlo realisations of the fiducial line model, with 
an assumed signal-to-noise ratio of 50. A wavelet filtered 
representation of each realised spectrum is generated with 
coefficients truncated to give a reduced $\chi^2_{\rm red} = 1$ for the
difference between the original and wavelet filtered spectra.
This corresponds on average to retaining only 30% of the
full set of coefficients. A representative spectrum is shown
in Fig. 7.

Absorption lines are then identified in the filtered spec-
trum and fit using AutoVP. The results of $10^4$ realisations
are shown in Figs. 8 and 9. Also shown are the distributions
obtained from AutoVP using the original spectra with no
wavelet filtering applied. The distributions are nearly iden-
tical. A negligible loss is incurred in the recovery of the line
parameters despite the exclusion of 70% of the wavelet
coefficients of the original spectra.
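The filtering step can be illustrated with a self-contained sketch. It is a toy under stated assumptions rather than the procedure applied here: the Haar wavelet replaces the Daubechies wavelet of order 20, the retained fraction is passed in directly instead of being set by the reduced $\chi^2$, and the names `forward`, `inverse` and `wavelet_filter` are hypothetical.

```python
import math

ROOT2 = math.sqrt(2.0)


def forward(signal):
    """Orthonormal Haar transform as a flat list [average, coarse..fine]."""
    smooth, out = list(signal), []
    while len(smooth) > 1:
        pairs = list(zip(smooth[0::2], smooth[1::2]))
        out = [(a - b) / ROOT2 for a, b in pairs] + out
        smooth = [(a + b) / ROOT2 for a, b in pairs]
    return smooth + out


def inverse(coeffs):
    """Invert forward(), doubling the resolution at each level."""
    smooth, pos = list(coeffs[:1]), 1
    while pos < len(coeffs):
        detail = coeffs[pos:pos + len(smooth)]
        pos += len(smooth)
        smooth = [v for s, d in zip(smooth, detail)
                  for v in ((s + d) / ROOT2, (s - d) / ROOT2)]
    return smooth


def wavelet_filter(signal, keep_fraction=0.3):
    """Zero all but the largest coefficients and reconstruct the signal.

    Coefficients tied with the threshold are all kept, so slightly more
    than keep_fraction of them may survive.
    """
    coeffs = forward(signal)
    n_keep = max(1, round(keep_fraction * len(coeffs)))
    threshold = sorted((abs(w) for w in coeffs), reverse=True)[n_keep - 1]
    kept = [w if abs(w) >= threshold else 0.0 for w in coeffs]
    return inverse(kept)


step = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
filtered = wavelet_filter(step, keep_fraction=0.25)
```

For the step function in the usage example only two coefficients are non-zero, so retaining a quarter of the coefficients reproduces the input to rounding error.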



Figure 6. The normalized frequencies of the wavelet coefficients
for the levels j = 3, 4, and 5 for two different statistical descrip-
tions of the Lyα forest. The solid lines are based on the Voigt
parameter distributions inferred by Kirkman & Tytler, and the
dashed on the distributions inferred by Hu et al. A signal-to-
noise ratio of 50 is used in both sets of simulations. The distribu-
tions of wavelet coefficients distinguish between the two models.



3 APPLICATION TO Q1937-1009 

In this section, the Discrete Wavelet Transform is used to
analyse the Lyα forest as measured in the $z = 3.806$ QSO
Q1937-1009. The spectrum was taken with the Keck HIRES
at a resolution of ~ 8.5 km s$^{-1}$ (Burles & Tytler 1997). The
signal-to-noise ratio per pixel was ~ 50. The spectrum cov-
ers the range between Lyα and Lyβ in the QSO restframe.
(The region analysed is restricted to the redshift interval
$3.055 < z < 3.726$ to avoid any possible influence by the
QSO.)

Table 1. Wavelet coefficient correlation matrix for resolution levels $j = 6, \ldots, 0$.

Figure 7. Synthetic spectrum with S/N = 50 (heavy solid his-
togram). The wavelet filtered spectrum with $\chi^2_{\rm red} = 1$ is shown
by the lighter line (shown as a smooth curve for clarity).

The distribution of wavelet coefficients is shown in
Fig. 10 for j = 2, 3, 4, and 5. As in Fig. 4, the high frequency
coefficients are generally smaller than at lower frequencies, 
indicating that the fluctuations that dominate the spectra 
have been resolved. 

Figure 8. The recovered H I column density distribution from
a set of Monte Carlo realizations. The dotted curve corresponds
to the Voigt fits obtained using the original unfiltered spectra.
The solid curve shows the recovered distribution obtained from
the wavelet filtered spectrum for which only 30% of the wavelet
coefficients are retained, corresponding to a reduced $\chi^2_{\rm red} = 1$ for
the difference between the original and wavelet filtered spectra.
The heavy solid line shows the input model distribution.

The cumulative distributions of the coefficients are com-
pared with the predicted distributions for the fiducial model
in Fig. 11. The predicted distributions were generated by
simulating spectra with the same pixelization, resolution,
signal-to-noise ratio and wavelength coverage as for the
measured spectrum of Q1937-1009. An increase in line den-
sity per unit redshift proportional to $(1 + z)^{2.6}$ (Kirkman
& Tytler) was included to match to the redshift range of
Q1937-1009. While the distributions generally agree well, a
large variation is found for $j = 4$, corresponding to fluctua-
tions on the scale of 17-34 km s$^{-1}$, suggesting some differ-
ences from the line model of Kirkman & Tytler. Effects ne-
glected in the simulations that could produce a difference are
the presence of metal systems and redshift correlations be-
tween the Lyα absorption systems. The changes that would
be produced, however, are most likely small: the number of
metal systems is small, and the correlations appear weak or
absent (Meiksin & Bouchet 1995; Kim et al. 1997). Still, the
sensitivity of the wavelet coefficient distributions to these ef-
fects may be worth more careful consideration.
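The size of the redshift correction applied to the line density can be gauged with a short calculation. The sketch assumes only the $(1 + z)^{2.6}$ scaling quoted above, normalized at $z = 3$; the function names are hypothetical.

```python
def line_density_ratio(z, z_ref=3.0, gamma=2.6):
    """dN/dz at redshift z relative to z_ref for dN/dz ~ (1 + z)**gamma."""
    return ((1.0 + z) / (1.0 + z_ref)) ** gamma


def relative_line_count(z1, z2, z_ref=3.0, gamma=2.6):
    """Integral of line_density_ratio over z1 < z < z2.

    This is the expected number of lines in the interval in units of
    dN/dz at z_ref.
    """
    g1 = gamma + 1.0
    return (((1.0 + z2) ** g1 - (1.0 + z1) ** g1)
            / (g1 * (1.0 + z_ref) ** gamma))


low = line_density_ratio(3.055)   # lower edge of the analysed interval
high = line_density_ratio(3.726)  # upper edge of the analysed interval
count = relative_line_count(3.055, 3.726)
```

Across the analysed interval the line density rises from a few per cent to roughly 50 per cent above its $z = 3$ value, so the correction is not negligible at the high redshift end.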



4 SUMMARY 

Wavelets may be usefully employed to provide a statis-
tical characterization of the absorption properties of the
Lyα forest. An approach is presented that performs a mul-
tiresolution analysis of the forest using the Discrete Wavelet
Transform of the QSO spectrum. The transform decomposes
the local frequency dependence of the light fluctuations into
an orthogonal hierarchy of basis functions, the wavelets. It
is found that in spectra of better than 10 km s$^{-1}$ resolution,
most of the information of the spectrum is carried by the
lower frequency wavelets. For a signal-to-noise ratio typical
of even the highest quality spectra (S/N = 10-100), only
10-30% of the wavelets are required to provide a statisti-
cally acceptable description of the spectrum, corresponding
to a data compression factor of 3-10. It is shown that a
Voigt profile line analysis performed on the wavelet filtered
spectra yields nearly identical line parameter distributions
to those obtained from the original unfiltered spectra.

Figure 9. The recovered Doppler parameter distribution, as in
Fig. 8. The heavy solid line shows the input model distribution.

Figure 10. The normalized distributions of the wavelet coef-
ficients for the spectrum of Q1937-1009. The distributions are
shown for the levels j = 2 (dot-dashed), j = 3 (long-dashed),
j = 4 (solid), and j = 5 (short-dashed). The faster decline
at higher frequencies demonstrates that the absorption features
dominating the spectrum have been adequately resolved.

Figure 11. The cumulative distributions of the wavelet coeffi-
cients for the spectrum of Q1937-1009, along with the predicted
distributions according to the line model of Kirkman & Tytler
(1997). The frequency levels shown are j = 2 (dot-dashed), j = 3
(long-dashed), j = 4 (solid), and j = 5 (short-dashed).

The distributions of the wavelet coefficients offer an al-
ternative statistical description of the Lyα forest while re-
taining information on the line widths. It is demonstrated
that the correlations of coefficients between different levels in 
the wavelet hierarchy are weak (a few percent or smaller). 
Consequently, each of the distributions may be treated as 
statistically independent to good approximation. 

The method is applied to a Keck HIRES spectrum of 
Q1937-1009. The wavelet coefficient distributions behave 
qualitatively similarly to those found in Monte Carlo sim- 
ulations based on the line parameter distributions reported 
by Kirkman & Tytler. The measured distributions, however, 
show some differences on the scale 17-34 km s$^{-1}$.

The results demonstrate that Multiresolution Analysis 
using the Discrete Wavelet Transform provides an alterna- 
tive objective, easily automated procedure for analysing the 
Lyα forest suitable for basing a comparison between the
measured properties of the Lyα forest and the predictions
of numerical models. 



The author thanks S. Burles and D. Tytler for kindly
providing the spectrum of Q1937-1009, and R. Dave for per- 
mission to use AutoVP. 